
    htsint: a Python library for sequencing pipelines that combines data through gene set generation

    Background: Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. Results: We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. Conclusion: The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint
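The abstract above describes partitioning genes into functional modules by spectral clustering on pairwise functional distances. The following is a minimal sketch of that general idea, not htsint's own implementation or API: it converts a toy distance matrix into Gaussian affinities and splits the items in two by the sign of the Fiedler vector of the graph Laplacian. The function name `spectral_bipartition`, the kernel width `sigma`, and the toy distances are all assumptions made for illustration.

```python
import math


def spectral_bipartition(dist, sigma=1.0, iters=500):
    """Split items into two groups by the sign of the Fiedler vector.

    `dist` is a symmetric matrix (list of lists) of pairwise distances
    with at least two rows. Hypothetical helper, not part of htsint.
    """
    n = len(dist)
    # A Gaussian kernel turns distances into affinities (no self-loops).
    w = [[math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2)) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    # 2 * max degree bounds the largest eigenvalue of L = D - W, so the
    # matrix (shift * I - L) has non-negative eigenvalues.
    shift = 2 * max(deg)
    # Power iteration on (shift * I - L), with the constant eigenvector
    # projected out, converges to the Fiedler vector of L.
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]
        nxt = [(shift - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
               for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        v = [x / norm for x in nxt]
    return [0 if x < 0 else 1 for x in v]


# Toy "functional distances": items 0, 2, 4 are close to each other,
# items 1, 3, 5 are close to each other, and the two groups are far apart.
toy_dist = [[0.0 if i == j else (0.1 if i % 2 == j % 2 else 5.0)
             for j in range(6)] for i in range(6)]
labels = spectral_bipartition(toy_dist)
```

Which group receives label 0 is arbitrary; only the grouping is meaningful. A production pipeline would more likely use an established implementation with k-way clustering rather than this two-way split.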

    Clustering Analyses of 300,000 Photometrically Classified Quasars--II. The Excess on Very Small Scales

    We study quasar clustering on small scales, modeling clustering amplitudes using halo-driven dark matter descriptions. From 91 pairs on scales <35 kpc/h, we detect only a slight excess in quasar clustering over our best-fit large-scale model. Integrated across all redshifts, the implied quasar bias is b_Q = 4.21+/-0.98 (b_Q = 3.93+/-0.71) at ~18 kpc/h (~28 kpc/h). Our best-fit (real-space) power index is ~-2 (i.e., $\xi(r) \propto r^{-2}$), implying steeper halo profiles than currently found in simulations. Alternatively, quasar binaries with separation <35 kpc/h may trace merging galaxies, with typical dynamical merger times t_d~(610+/-260)m^{-1/2} Myr/h, for quasars of host halo mass m x 10^{12} Msolar/h. We find UVX quasars at ~28 kpc/h cluster >5 times higher at z > 2 than at z < 2, at the $2.0\sigma$ level. However, as the space density of quasars declines as z increases, an excess of quasar binaries (over expectation) at z > 2 could be consistent with reduced merger rates at z > 2 for the galaxies forming UVX quasars. Comparing our clustering at ~28 kpc/h to a $\xi(r) = (r/4.8\,h^{-1}\,\mathrm{Mpc})^{-1.53}$ power-law, we find an upper limit on any excess of a factor of 4.3+/-1.3, which, noting some caveats, differs from large excesses recently measured for binary quasars, at $2.2\sigma$. We speculate that binary quasar surveys that are biased to z > 2 may find inflated clustering excesses when compared to models fit at z < 2. We provide details of 111 photometrically classified quasar pairs with separations <0.1'. Spectroscopy of these pairs could significantly constrain quasar dynamics in merging galaxies. Comment: 12 pages, 3 figures, 2 tables; uses emulateapj; accepted to Ap

    lpEdit: an editor to facilitate reproducible analysis via literate programming

    Copyright 2013 Adam J Richards et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. There is evidence to suggest that a surprising proportion of published experiments in science are difficult if not impossible to reproduce. The concepts of data sharing, leaving an audit trail and extensive documentation are fundamental to reproducible research, whether it is in the laboratory or as part of an analysis. In this work, we introduce a tool for documentation that aims to make analyses more reproducible in the general scientific community. The application, lpEdit, is a cross-platform editor, written with PyQt4, that enables a broad range of scientists to carry out the analytic component of their work in a reproducible manner through the use of literate programming. Literate programming mixes code and prose to produce a final report that reads like an article or book. lpEdit targets researchers getting started with statistics or programming, so the hurdles associated with setting up a proper pipeline are kept to a minimum and the learning burden is reduced through the use of templates and documentation. The documentation for lpEdit is centered around learning by example, and accordingly we use several increasingly involved examples to demonstrate the software’s capabilities. We first consider applications of lpEdit to process analyses mixing R and Python code with the LaTeX documentation system. Finally, we illustrate the use of lpEdit to conduct a reproducible functional analysis of high-throughput sequencing data, using the transcriptome of the butterfly species Pieris brassica

    A Simple Likelihood Method for Quasar Target Selection

    We present a new method for quasar target selection using photometric fluxes and a Bayesian probabilistic approach. For our purposes we target quasars using Sloan Digital Sky Survey (SDSS) photometry to a magnitude limit of g=22. The efficiency and completeness of this technique are measured using the Baryon Oscillation Spectroscopic Survey (BOSS) data, taken in 2010. This technique was used for the uniformly selected (CORE) sample of targets in BOSS year one spectroscopy to be realized in the 9th SDSS data release. When targeting at a density of 40 objects per sq-deg (the BOSS quasar targeting density) the efficiency of this technique in recovering z>2.2 quasars is 40%. The completeness compared to all quasars identified in BOSS data is 65%. This paper also describes possible extensions and improvements for this technique. Comment: Updated to accepted version for publication in the Astrophysical Journal. 10 pages, 10 figures, 3 table
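The abstract above gives no implementation details, so the following is only a generic sketch of how a Bayesian flux-based classifier could assign a quasar probability to an object. It assumes Gaussian flux errors and two small training sets of model fluxes; the function names, the equal prior, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def flux_likelihood(fluxes, errors, training_set):
    """Average Gaussian likelihood of the observed fluxes over a training set.

    The Gaussian normalisation depends only on `errors`, so it is the same
    for every class and cancels in the probability ratio below.
    """
    total = 0.0
    for model in training_set:
        chi2 = sum(((f - m) / e) ** 2
                   for f, m, e in zip(fluxes, model, errors))
        total += math.exp(-0.5 * chi2)
    return total / len(training_set)


def quasar_probability(fluxes, errors, quasars, others, prior_qso=0.5):
    """Posterior probability that an object is a quasar (Bayes' rule)."""
    l_qso = prior_qso * flux_likelihood(fluxes, errors, quasars)
    l_other = (1.0 - prior_qso) * flux_likelihood(fluxes, errors, others)
    if l_qso + l_other == 0.0:
        return 0.0  # both likelihoods underflowed; treat as a non-detection
    return l_qso / (l_qso + l_other)


# Toy two-band training fluxes (arbitrary units, purely illustrative).
toy_quasars = [(1.0, 2.0), (1.2, 2.1)]
toy_others = [(3.0, 1.0), (3.2, 0.9)]
toy_errors = (0.2, 0.2)

p_near_quasars = quasar_probability((1.1, 2.0), toy_errors, toy_quasars, toy_others)
p_near_others = quasar_probability((3.1, 1.0), toy_errors, toy_quasars, toy_others)
```

Targets would then be ranked by this probability and selected down to the available density, which is where a trade-off between efficiency and completeness arises.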

    Hadronic Resonances from Lattice QCD

    The determination of the pattern of hadronic resonances as predicted by Quantum Chromodynamics requires the use of non-perturbative techniques. Lattice QCD has emerged as the dominant tool for such calculations, and has produced many QCD predictions which can be directly compared to experiment. The concepts underlying lattice QCD are outlined, methods for calculating excited states are discussed, and results from an exploratory Nucleon and Delta baryon spectrum study are presented. Comment: 8 pages, VII Latin American Symposium on Nuclear Physics and Application

    Results and Frontiers in Lattice Baryon Spectroscopy

    The Lattice Hadron Physics Collaboration (LHPC) baryon spectroscopy effort is reviewed. To date the LHPC has performed exploratory Lattice QCD calculations of the low-lying spectrum of Nucleon and Delta baryons. These calculations demonstrate the effectiveness of our method by obtaining the masses of an unprecedented number of excited states with definite quantum numbers. Future work of the project is outlined. Comment: To appear in the proceedings for the VII Latin American Symposium of Nuclear Physics and Application